I. INTRODUCTION


The U.S. Navy spends over 2 billion dollars a year on training. Training costs are rising, but the Navy does not have a clear understanding of why. A multitude of factors affect cost; however, we do not know what those factors are. To understand this problem, let us develop a general concept to work from. (See Figure 1.1.)


$S$ = the set of all events that have the power to affect training costs

$A$ = {events that have occurred} (factors)
$B$ = {events that have not occurred}
$S = A \cup B$, $\emptyset = A \cap B$

Figure 1.1 The concept.


Let us identify events that have the power to affect training costs. We will call this set S. Secondly, let us divide the set S into two mutually exclusive sets A and B. Let A be the set of all events that have occurred and B be the set of all events that have not occurred. Our goal is to find events that belong to set A. Set A will be labeled factors since, by definition, a factor is a contributing element that brings about a given result. In our case, the end result is rising training costs.


A. PROBLEM STATEMENT 
Why is the cost of training rising? To answer this question, we divided the problem into several subproblems. We selected three subproblems to be research questions for this study.

• Has the length of basic training increased?
• Has attrition increased?
• Has the amount of specialized training increased?

Our goal is to identify events that affect training costs. Embedded within our problem statement are three events. These events are:

A. The length of basic training has increased.
B. Attrition has increased.
C. The amount of specialized training has increased.

Can we classify any of these events as factors? Or stated differently, “Have any of these events occurred?” If event A, B, or C occurred, then at least one reason will exist to explain the rise in training cost.


B. OBJECTIVES 


This study attempts to answer three questions. Let us transform those questions into statistical hypotheses.

$H_0$: The length of basic training has not increased.
$H_1$: The length of basic training has increased.

$H_0$: Attrition has not increased.
$H_1$: Attrition has increased.

$H_0$: The amount of specialized training has not increased.
$H_1$: The amount of specialized training has increased.





These three hypotheses form the basis of this study. Statistical methods will answer these questions by either accepting or rejecting the null hypothesis. The objectives of this thesis are:

1. Test all three hypotheses.
2. Accept or reject each event as a factor that increases cost.
II. HISTORY AND BACKGROUND


The Chief of Naval Operations (CNO) expected training costs to fall when retention increased in the early 1980s. However, a decrease did not occur. The Center for Naval Analyses (CNA) was tasked to examine the relationships between training costs and retention. CNA formulated some general reasons why training costs might change. They set out to confirm those reasons by using information stored in their historical data files. From those data files, they provided a small data base for this study.


A. DATA BASE DESCRIPTION 

The Navy has 101 enlisted rating codes. CNA’s data set contains information on every enlisted rating. The data base used for this study contains information on only three enlisted ratings. These ratings are:

AT = Aviation Technician
AW = Aviation Anti-Submarine Warfare Operator
AX = Aviation Anti-Submarine Warfare Technician

We selected these ratings for the following reasons. This author, in conjunction with CNA, expressed an interest in examining the aviation community. Next, we decided to observe two closely related technical ratings from a squadron’s maintenance department, so we selected the AT’s and AX’s. Lastly, we wanted to observe a rating from the squadron’s operations department, so we selected the AW’s.

The second point that characterizes this data base is that it is a selected sample from the three ratings. Given that a record has a rating code of ‘AT’, ‘AW’, or ‘AX’, the program’s screening criterion selects all records that are coded ‘SG = School Guarantee’. We will say more about this criterion in the next section. Figure 2.1 provides a Venn diagram concerning the selection process for records that entered this study’s data base. Corliss [Ref. 1] describes the original data set. See Appendix B for a detailed layout of this data base.


CNA’s Data Set

$A = (\text{AT} \cup \text{AW} \cup \text{AX})$
$\text{Data Base} = (A \cap B)$

Figure 2.1 Record selection process.


B. EXPECTED TRAINING PATH 

For the first enlistment period, an individual’s expected career path follows that which is portrayed in Figure 2.2. An individual receives indoctrination at Recruit Training Command (RTC). This command is commonly known as Boot Camp. The recruit proceeds to A-school upon completion of Boot Camp. A-school provides the recruit initial skills. Upon completion of A-school, the individual advances to the fleet. The individual will receive more school based training from C-schools and F-schools while serving productively in the fleet. C-schools and F-schools provide an individual with advanced skills and fleet skills respectively.

Let us return to the data base selection criteria. A ‘School Guarantee’ is a clause written in the recruit’s enlistment contract that assures the recruit will proceed directly to A-school upon completion of Boot Camp. Without the ‘School Guarantee’, a recruit may be sent directly to the fleet from Boot Camp. This study is strictly concerned with individuals who follow the expected training pipeline as depicted in Figure 2.2.


CAREER PATH

[Timeline: Boot Camp and A-School make up the Training Period, which is followed by the Productive Period.]

While serving in the fleet, a person will receive training from C-Schools and F-Schools.

Figure 2.2 First-term enlistment milestones.


C. LIMITATIONS 

As discussed earlier, the Navy has 101 enlisted ratings. However, the data base used to support this study has only three enlisted ratings. Secondly, these individuals are selected, not random. Thirdly, we are observing the performance of each group over time. The time frame is dependent upon the rating we are observing. The time frames available for study are:

AT  81 82 83 84
AW  77 78 79 80 81 82 83 84
AX  81 82 83 84

The reason for the differences in time frames is that prior to 1981, school guarantees were not given to individuals desiring the AT or AX ratings.


D. SCOPE
The scope of this study is restricted to the first enlistment period. (See Figure 2.3.) The following subsections describe the measures used in the analysis. Limitations and definitions are listed to set the foundation for each hypothesis test.

1. Length of Basic Training
The data base does not provide us with a way to calculate the exact time a person spends in basic training; however, we have another measure. This measure is called ‘time to get rated’. (See Figure 2.4.) For each individual, we have two dates. These dates are defined as follows:


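ENLISTMENT PERIODS

[Timeline: the first and second enlistment periods, starting at PEBD and ending at EAOS$_1$ and EAOS$_2$ respectively.]

PEBD = Pay Entry Base Date
EAOS = End of Active Obligated Service

Figure 2.3 Enlistment Periods.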


• PEBD = (Pay Entry Base Date) This is the date a person enters the Navy. This date is used for accounting purposes.
• RD = (Rating Date) This is the date a person is designated into one of the Navy’s occupational specialties.


[Timeline: the training period runs from PEBD to RD; the productive period follows RD.]

PEBD = Pay Entry Base Date
RD = Rating Date
EAOS = End of Active Obligated Service

Figure 2.4 Initial Training Period.


A person gets rated upon completion of A-School or shortly thereafter. As seen in Figure 2.4, time to get rated is defined as the difference between a person’s rating date and pay entry base date. Time to get rated will be used to measure the length of basic training.
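The definition above can be sketched in a few lines of Python. This is only an illustration: the function names, the 30-day convention for leftover days, and the sample dates are our assumptions, not part of the CNA data base.

```python
from datetime import date


def time_to_get_rated(pebd: date, rd: date) -> float:
    """Months between Pay Entry Base Date (PEBD) and Rating Date (RD).

    Assumption: whole calendar months plus a 30-day fraction for leftover days.
    """
    whole_months = (rd.year - pebd.year) * 12 + (rd.month - pebd.month)
    return whole_months + (rd.day - pebd.day) / 30.0


def within_basic_training_window(months: float, limit: float = 24.0) -> bool:
    """True when the value falls inside the 24-month basic training window."""
    return months <= limit


# Hypothetical individual (dates invented for illustration).
months = time_to_get_rated(date(1982, 3, 15), date(1983, 8, 15))
print(months)                                # 17.0
print(within_basic_training_window(months))  # True
```

An individual with a value above the limit would fall outside the window this study examines.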


As outlined in Figure 2.3, this study is restricted to the first enlistment period. This time frame is normally 48 months. The first half of the enlistment period is defined as the Basic Training period. Using this definition, our study of basic training will be restricted to the first 24 months of the enlistment period. (See Figure 2.5.)


BASIC TRAINING TIME CONSTRAINT
[Timeline of the enlistment period, in months.]

ATTRITION TIME CONSTRAINT
[Plot of $N(t)$ against time in months, where $N(t)$ = number of survivors.]

Analysis will be performed within the time constraint denoted by: ->| |<-

Figure 2.5 Time constraint.


2. Attrition 
Percent losses and attrition rates are the measures used to compare year groups. Given a year group, percent loss is defined as the number of individuals that leave the Navy divided by the number of individuals that enlisted in the Navy. Attrition rate is defined as the number of individuals that leave the Navy per month. We restrict our analysis to the first 24 months per year group. Our goal is to measure attrition in the training environment and not in the operational environment. (See Figure 2.5.)
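The two measures can be illustrated with a short Python sketch. The counts below are hypothetical, the function names are ours, and we assume the ratio is multiplied by 100 to express it as a percentage.

```python
def percent_loss(enlisted: int, losses: int) -> float:
    """Losses as a percentage of the individuals who enlisted in the year group."""
    return 100.0 * losses / enlisted


def attrition_rate(losses: int, months: int = 24) -> float:
    """Average number of individuals leaving per month over the analysis window."""
    return losses / months


# Hypothetical year group: 200 enlistees, 30 of whom leave within 24 months.
print(percent_loss(200, 30))   # 15.0
print(attrition_rate(30))      # 1.25
```

Note that this simple per-month average is only a descriptive figure; the attrition rates analyzed later are estimated from a survival model.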


3. Amount of specialized training 

The Navy’s C-schools provide individuals with advanced/specialized skills. Upon completion of a C-school course, the individual receives a Naval Enlisted Classification (NEC) code. NEC codes supplement the enlisted rating structure by identifying particular skills in more detail than the occupational or rating structure. The naval terminology is simply this:

• RATING = individual’s occupational specialty
• NEC = individual’s occupational subspecialty

As an example, see Table I. Joe Sailor’s occupational specialty is Aviation Technician. Joe Sailor’s occupational subspecialty is:

- Aircraft Radar Altimeter IMA Technician
- Aircraft Doppler Radar IMA Technician
- Aircraft Navigation Computers IMA Technician

In general, his occupation deals with aircraft navigation systems.

We measured the amount of specialized training a year group received by the number of NECs received. This measurement took place during the second and third year of service. (See Figure 2.6.)

The reasons we defined the second and third year of service as the window for analysis are threefold. One, if an individual follows the expected training pipeline, the first year is spent in Boot Camp and A-school. Since the individual is not enrolled in C-school during the first year, the expected number of NECs earned will be zero. Two, if we use the entire time period spanned by the data base, year group 78 will have had more time to acquire NEC codes than year group 80. We need to ensure each year group has exactly the same time length and the same time period in their respective careers to accumulate NEC codes. Three, we stated earlier that our analysis will be restricted to the first enlistment period.

¹The Naval Aviation Maintenance Program has three levels of maintenance. The levels are operational, intermediate, and depot. IMA is known as intermediate level maintenance.

TABLE I
RATING AND NEC CODES

RATING   DESCRIPTION
AT       Aviation Technician

NEC      DESCRIPTION
6605     Aircraft Radar Altimeter IMA Technician
6606     Aircraft Doppler Radar IMA Technician
6608     Aircraft Navigation Computer IMA Technician

[Figure: year groups 77, 78, 79, 80, and 81 by year. Boxes represent the time frame a year group will be under examination.]


III. METHODOLOGY AND ANALYSIS


A. BASIC TRAINING 


1. Time to get rated: Is there a trend? 
Has the time to get rated changed over time? To answer this question, we define the Two Factor Analysis of Variance (ANOVA) model as follows:

MODEL: $Y_{ijk} = \mu + \beta_i + \tau_j + (\beta\tau)_{ij} + \varepsilon_{ijk}$

INDICES: $i$ = rating
$j$ = year group
$k$ = $k^{th}$ individual from group $(i,j)$

$Y_{ijk}$ = number of months the $k^{th}$ individual from group $(i,j)$ took to get rated
$\mu$ = overall average time to get rated (grand mean)
$\beta_i$ = additional time it takes an individual from rating $i$ to get rated
$\tau_j$ = additional time it takes an individual from year group $j$ to get rated
$(\beta\tau)_{ij}$ = interaction term
$\varepsilon_{ijk}$ = error terms that are iid $N(0,\sigma^2)$

The goal is to test the $\tau$ vector. Is the mean time to get rated from one year group statistically different from another? We answer this question by using a statistical test. The hypothesis test and decision rule are listed in Table II.


TABLE II
TWO FACTOR ANOVA: HYPOTHESIS TEST

$H_0$: $\tau_{77} = \tau_{78} = \cdots = \tau_{84}$

$H_0$: The mean time to get rated has remained constant.
$H_1$: Not all the means are equal.

If $F^* \le F(.95, 7, 2690)$ then conclude $H_0$
If $F^* > F(.95, 7, 2690)$ then conclude $H_1$


The other terms in the model, $\mu$, $\beta$, and $(\beta\tau)$, are considered nuisance factors. Our goal is to account for their effects and block out their contribution. This prevents the estimate of $\sigma^2$ from being inflated. The main goal is to test for differences among year groups.

Table III lists the results of the test. All main factors are significant. Look at the table results concerning the $\tau$ vector. It is statistically significant at the .0001 level. It is highly unlikely that the $\tau$’s are equal. The P value (.0001) supports the alternate hypothesis: not all the means are equal. Using our decision rule, since $F^* > F$, we accept the alternate hypothesis and conclude a trend exists. “The time to get rated has changed over the years.”

Figure 3.1 is a scatter plot of the entire population. A couple of interesting things are worth noting.

• Outliers are located above the mean, none below.
• On the average, Year Group 84 took the least amount of time to get rated.
• The dispersion about the population means is smallest within Year Group 84.
Notice the presence of outliers on the high side but none on the low side. As expected, there is some minimum time required to get rated but no upper bound. We will truncate all values of Y greater than 24 months in the ensuing analysis. The reasons are threefold. One, as stated in the original set of objectives, the focus on Basic Training will be restricted to the first two years of service. Two, a set of unusual circumstances caused these individuals to take a substantial amount of time to get rated. They have detoured from the expected training pipeline and we are not interested in these individuals. Three, truncating the outliers will stabilize the variance for future ANOVA tests. Only 25 data points will be lost. This amounts to .009 or .9% of the observations. Censoring these data points should not affect future tests.

Now, let us look at 1984. Tables IV and V display Tukey’s pairwise comparisons for all year groups. All pairwise comparisons with year group 84 are statistically significant. Since the average time to get rated by Year Group 84 is the least among all year groups, we will delete that group from the ensuing analysis. No further analysis need be done on that year group.

In summary, this first test establishes a trend. The time to get rated has changed over the years. Secondly, the time to get rated has decreased from 1983 to 1984. Let us investigate what happened prior to 1984.


2. Has the time to get rated increased or decreased through 1983? 

The first test revealed the presence of a trend. The test also pointed out that the time to get rated decreased from 1983 to 1984 for all groups. To see what happened prior to 1984, we will test each group separately. We will follow the methodology used in Neter, Wasserman, and Kutner [Ref. 2: Sec. 17.2]. The objectives are:

• Estimate the mean time to get rated for each year group.
• Test the means for statistical difference.
• Rank the means using a paired comparison test.

Our analytical tool to test the means for statistical differences is the Single Factor ANOVA Model. The Kruskal-Wallis (KW) nonparametric test for equal means will be used as a backup test. Then, given the means are different, Tukey’s paired comparison test will be used to examine the nature of the differences. Based on the paired comparison test results, we will rank the means.
Comparisons significant at the 0.05 level are indicated by ‘***’
Critical value of studentized range = $q(.95; 7, 2683) = 4.290$
Tukey’s paired comparison confidence interval: $D \pm T s(D)$
where: $s^2(D) = [(1/n_i) + (1/n_j)]\,MSE$
MODEL: $Y_{ij} = \mu + \tau_i + \varepsilon_{ij}$

INDICES: $i$ = year group
$j$ = $j^{th}$ individual from year group $i$

$Y_{ij}$ = number of months the $j^{th}$ individual from year group $i$ took to get rated
$\mu$ = overall average time to get rated
$\tau_i$ = additional time it takes an individual from year group $i$ to get rated
$\varepsilon_{ij}$ = error terms that are iid $N(0,\sigma^2)$


The hypothesis test and decision rules associated with the Analysis of Variance model and the Kruskal-Wallis nonparametric test are listed in Table VI. Test results, tables, and figures that support this discussion are grouped together. They are laid out in the following manner.

AT   Figure 3.2   Data Analysis Graphs
     Table VII    ANOVA/KW test results
     Figure 3.3   Tukey’s paired comparison test results

AW   Figure 3.4   Data Analysis Graphs
     Table VIII   ANOVA/KW test results
     Figure 3.5   Tukey’s paired comparison test results

AX   Figure 3.8   Data Analysis Graphs
     Table IX     ANOVA/KW test results
     Figure 3.9   Tukey’s paired comparison test results

Figures 3.2, 3.4, and 3.8 provide a graphical summary of the data sets. Tables VII, VIII, and IX provide the ANOVA test results and the Kruskal-Wallis test results. Figures 3.3, 3.5, and 3.9 provide Tukey’s paired comparison test results. These figures display a graphical ranking of the means and a confidence interval for the difference in means. Specific results are listed in the figures and tables. We summarize our findings.


TABLE VI
SINGLE FACTOR ANOVA: HYPOTHESIS TEST #1

$H_0$: The mean time to get rated has remained constant.
$H_1$: Not all the means are equal.

-ANOVA-
If $F^* \le F(.95, \nu_1, \nu_2)$ then conclude $H_0$
If $F^* > F(.95, \nu_1, \nu_2)$ then conclude $H_1$

-KW-
If $\chi^2_{KW} \le \chi^2(.95, \nu)$ then conclude $H_0$
If $\chi^2_{KW} > \chi^2(.95, \nu)$ then conclude $H_1$
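To make the two test statistics concrete, the following Python sketch computes $F^*$ for a single factor ANOVA and the Kruskal-Wallis statistic for a small data set. The data values and function names are hypothetical, and the Kruskal-Wallis function uses midranks without a tie correction, which is a simplification on our part.

```python
def one_way_anova_f(groups):
    """F* = MS(model) / MS(error) for a single factor ANOVA."""
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    ss_model = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_error = sum((y - m) ** 2 for g, m in zip(groups, means) for y in g)
    df_model = len(groups) - 1
    df_error = n_total - len(groups)
    return (ss_model / df_model) / (ss_error / df_error)


def kruskal_wallis(groups):
    """Kruskal-Wallis statistic using midranks (no tie correction applied)."""
    pooled = sorted(y for g in groups for y in g)
    rank_of = {}
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j] == pooled[i]:
            j += 1
        rank_of[pooled[i]] = (i + 1 + j) / 2.0  # average of ranks i+1 .. j
        i = j
    n = len(pooled)
    weighted = sum(sum(rank_of[y] for y in g) ** 2 / len(g) for g in groups)
    return 12.0 / (n * (n + 1)) * weighted - 3 * (n + 1)


# Hypothetical months-to-get-rated for three year groups.
year_groups = [[13, 14, 15], [16, 17, 18], [19, 20, 21]]
f_star = one_way_anova_f(year_groups)
kw = kruskal_wallis(year_groups)
print(round(f_star, 2), round(kw, 2))
```

Each statistic is then compared with its critical value, $F(.95, \nu_1, \nu_2)$ or $\chi^2(.95, \nu)$, exactly as in the decision rules of Table VI.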





• For all three ratings, the Analysis of Variance test and the Kruskal-Wallis test results were highly significant. The probability that the means are equal is almost zero. In all three cases we reject the null hypothesis and accept the alternate hypothesis. We conclude: “The mean time to get rated has changed over the years.”

• For the AT selectees, the time to get rated is best described as no difference between year groups 81 and 82. However, year group 83 took an extra 1.5 months to get rated. There is a slight upward trend.

• For the AW selectees, the time to get rated is best described as cyclic. The mean time to get rated is highest in 1977. Over the next two years, the mean time to get rated drops to its lowest in 1979. After 1979, the trend is upwards for the next 4 years.

• For the AX selectees, the trend is U shaped. The mean time to get rated drops in 1982 and rises in 1983.


B. ATTRITION 

Has attrition increased over the years? If the answer is yes, then attrition is a factor causing training costs to rise. A simple relationship exists between attrition and training costs. If the attrition rate is high, then the Navy must train more people to fulfill quotas. Increasing the number of people to be trained raises the training cost.

Two methods are used to answer the question. The first method uses the actual percent losses. The annual percent losses are inputs into the Cox and Stuart nonparametric test. The test determines whether an increasing trend exists. The second method uses a regression approach. Attrition rates are estimated using a nonlinear regression model. These rates serve as inputs into a simple linear regression model.


1. Percent Losses: Is it rising? 


percent loss from school $i$ and year group $j$ = number of individuals that left the Navy from group $(i,j)$ divided by the number of individuals that started in group $(i,j)$

Percent losses were calculated twice, once for Boot Camp and once for A-school. We examined the sequence of numbers for an upward trend by using the Cox and Stuart nonparametric test. Conover [Ref. 3: p. 133] outlines the test procedures in detail.
[Plot: time to get rated, in months, by year group.]

Figure 3.2 AT: Time to get rated.
TABLE VII
AT: TIME TO GET RATED
ANOVA TEST RESULTS

                                                              -PERCENTILES-
$i$      $n_i$    %       $\hat{\mu}_i$   $\hat{\sigma}_i$   0.25   0.50   0.75
81       226      20.3    16.385          3.5600             14     15     19
82       521      46.9    16.785          2.7153             15     17     19
83       365      32.8    18.321          3.1104             17     19     20
Total    1112     100.0   17.208          3.1130             15     18     19

CLASS    LEVELS   VALUES
$\tau$   3        81 82 83

SOURCE   df     SS           MS         $F^*$    PR > F
Model    2      698.0855     349.0428   37.92    0.0001
Error    1109   10206.9280   9.2037
Total    1111   10905.0135

$R^2$    C.V.      $\sqrt{MSE}$   $\hat{\mu}$
0.0640   17.6302   3.0338         17.2077

KRUSKAL-WALLIS NONPARAMETRIC TEST FOR EQUAL MEANS
df   $\chi^2_{KW}$   PR > $\chi^2$
2    [illegible]     0.00

$F(.95, 2, 1109) = 3.00$     $\chi^2(.95, 2) = 5.99$


AT SELECTEES

$i$    $n_i$   $\hat{\mu}_i$
81     226     16.385
82     521     16.785
83     365     18.321

[Plot of the means by year group, 79 through 84.]

$(i, j)$   Lower limit   Difference   Upper limit
83-82      1.050         1.536        2.022         ***
83-81      1.333         1.936        2.538         ***
82-83      -2.022        -1.536       -1.050        ***
82-81      -0.167        0.400        0.967
81-83      -2.538        -1.936       -1.333        ***
81-82      -0.967        -0.400       0.167

$\alpha$   df     MSE
0.05       1109   9.2037

Means boxed together are not statistically different.
Comparisons significant at the 0.05 level are indicated by ‘***’
Critical value of studentized range = $q(.95; 2, 1107) = 3.319$
Tukey’s paired comparison confidence interval: $D \pm T s(D)$
where: $D = (\mu + \tau_i) - (\mu + \tau_j)$
$T = (1/\sqrt{2})\,q$
$s^2(D) = [(1/n_i) + (1/n_j)]\,MSE$

Figure 3.3 AT: Tukey’s paired comparison test results #1.
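As a check on the interval formula, the following Python sketch recomputes the 83-82 comparison for the AT selectees from the reported means, group sizes, MSE, and critical value $q$. The function name is ours; the inputs are the values listed for the AT selectees.

```python
import math


def tukey_interval(mean_i, mean_j, n_i, n_j, mse, q):
    """Return (lower, D, upper) for the interval D +/- T * s(D)."""
    d = mean_i - mean_j                          # D = difference in means
    s_d = math.sqrt((1.0 / n_i + 1.0 / n_j) * mse)
    t = q / math.sqrt(2.0)                       # T = (1/sqrt(2)) q
    return d - t * s_d, d, d + t * s_d


# Year group 83 versus year group 82, AT selectees.
lower, d, upper = tukey_interval(18.321, 16.785, 365, 521, 9.2037, 3.319)
print(round(lower, 3), round(d, 3), round(upper, 3))
```

The result is an interval of roughly 1.050 to 2.022 months around a difference of 1.536. Because the interval excludes zero, the comparison is flagged as significant.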
Figure 3.4 AW: Time to get rated.

TABLE VIII
AW: TIME TO GET RATED
ANOVA TEST RESULTS

(Entries that are illegible in the source are shown as “?”.)

                                                              -PERCENTILES-
$i$      $n_i$    %       $\hat{\mu}_i$   $\hat{\sigma}_i$   0.25   0.50   0.75
77       70       5.6     17.214          3.1249             ?      18     19
78       99       7.9     14.414          3.7743             12     15     17
79       161      12.8    13.199          3.7895             10     13     16
80       209      16.6    14.986          3.9559             13     15     18
81       174      13.8    14.270          3.8829             12     13     ?
82       303      24.0    14.703          ?                  13     ?      17
83       243      19.3    16.988          3.2721             14     ?      19
Total    1259     100.0   15.056          3.3781             12     15     ?

CLASS    LEVELS   VALUES
$\tau$   7        77 78 79 80 81 82 83

SOURCE   df     SS           MS         $F^*$    PR > F
Model    6      1975.1705    329.1951   25.65    0.0001
Error    1252   16066.9375   12.8330
Total    1258   18042.1080

$R^2$    C.V.      $\sqrt{MSE}$   $\hat{\mu}$
0.1095   23.7939   3.5823         15.056

KRUSKAL-WALLIS NONPARAMETRIC TEST FOR EQUAL MEANS
df   $\chi^2_{KW}$   PR > $\chi^2$
6    144.45          0.00

$F(.95, 6, 1252) = 2.10$     $\chi^2(.95, 6) = 12.59$
AW SELECTEES

$i$    $\hat{\mu}_i$
77     17.214
78     14.414
79     13.199
80     14.986
81     14.270
82     14.703
83     16.988

[Plot of the means by year group, 76 through 84.]

Means boxed together are not statistically different.

Figure 3.5 AW: Tukey’s paired comparison test results #1a.
Comparisons significant at the 0.05 level are indicated by ‘***’
Critical value of studentized range = $q(.95; 7, 1245) = 4.176$
Tukey’s paired comparison confidence interval: $D \pm T s(D)$
where: $D = (\mu + \tau_i) - (\mu + \tau_j)$
$T = (1/\sqrt{2})\,q$
$s^2(D) = [(1/n_i) + (1/n_j)]\,MSE$

Figure 3.6 AW: Tukey’s paired comparison test results #1b.
Comparisons significant at the 0.05 level are indicated by ‘***’
Critical value of studentized range = $q(.95; 7, 1245) = 4.176$
Tukey’s paired comparison confidence interval: $D \pm T s(D)$
where: $s^2(D) = [(1/n_i) + (1/n_j)]\,MSE$

Figure 3.7 AW: Tukey’s paired comparison test results #1c.
TABLE IX
AX: TIME TO GET RATED
ANOVA TEST RESULTS

(Entries that are illegible in the source are shown as “?”.)

                                                              -PERCENTILES-
$i$      $n_i$    %       $\hat{\mu}_i$   $\hat{\sigma}_i$   0.25   0.50   0.75
81       33       13.1    17.667          ?                  16     18     20
82       139      55.4    16.863          ?                  16     17     ?
83       79       31.5    18.481          2.7496             17     19     20
Total    251      100.0   17.478          3.0084             16     18     20

CLASS    LEVELS   VALUES
$\tau$   3        81 82 83

SOURCE   df    SS          MS        $F^*$   PR > F
Model    2     133.1718    66.5859   7.75    0.0005
Error    248   2129.4577   8.5865
Total    250   2262.6295

$R^2$    C.V.      $\sqrt{MSE}$   $\hat{\mu}$
0.0589   16.7654   2.9303         17.478

KRUSKAL-WALLIS NONPARAMETRIC TEST FOR EQUAL MEANS
df   $\chi^2_{KW}$   PR > $\chi^2$
2    23.846          0.00

$F(.95, 2, 248) = 3.00$     $\chi^2(.95, 2) = 5.99$


AX SELECTEES

$i$    $n_i$   $\hat{\mu}_i$
81     33      17.667
82     139     16.863
83     79      18.481

[Plot of the means by year group, 79 through 84.]

$(i, j)$   Lower limit   Difference   Upper limit
83-82      0.644         1.618        2.591         ***
83-81      -0.618        0.814        2.246
82-83      -2.591        -1.618       -0.644        ***
82-81      -2.141        -0.803       0.535
81-83      -2.246        -0.814       0.618
81-82      -0.535        0.803        2.141

$\alpha$   df    MSE
0.05       248   8.5865

Means boxed together are not statistically different.
Comparisons significant at the 0.05 level are indicated by ‘***’
Critical value of studentized range = $q(.95; 2, 246) = 3.335$
Tukey’s paired comparison confidence interval: $D \pm T s(D)$
where: $D = (\mu + \tau_i) - (\mu + \tau_j)$
$T = (1/\sqrt{2})\,q$
$s^2(D) = [(1/n_i) + (1/n_j)]\,MSE$

Figure 3.9 AX: Tukey’s paired comparison test results #1.
Cox and Stuart’s test is designed to detect trends in a sequential data set. Let $X_1, \ldots, X_n$ be a sequence of random variables. The test procedures are:

1. Group the random variables into pairs $[(X_1, X_{1+m}), \ldots, (X_{n-m}, X_n)]$ where $m = n/2$.
2. Replace each pair with a (+) if $(X_{i+m} > X_i)$ or a (−) if $(X_{i+m} < X_i)$.
3. Let $n$ equal the total number of signs, $T^+$ equal the number of (+)’s, and $T^-$ equal the number of (−)’s.
4. Set up a binomial test with parameters $(n, .5)$.
5. Accept or reject the null hypothesis using the test statistic $T^+$.

Notice the arrangement of random variables. If an upward trend exists, the smallest numbers will be near the beginning of the sequence and the larger numbers near the end. The design helps to display this increasing trend. If an upward trend is present, the number of (+)’s will be greater than the number of (−)’s. If a truly random pattern existed, the number of (+)’s should be approximately equal to the number of (−)’s $(T^+ \approx T^-)$.

To test whether the number of (+)’s is significantly different from the number of (−)’s, we use the binomial test with parameters $(n, p)$ where $n = T^+ + T^-$ and $p = .5$.

We tested all data sets using the above procedures. Figures 3.10 through 3.21 provide the specific results. They are laid out in the following manner.

AT   Figure 3.10   Percent Losses from Boot Camp
     Figure 3.11   Cox and Stuart Test Results
     Figure 3.12   Percent Losses from A-School
     Figure 3.13   Cox and Stuart Test Results

AW   Figure 3.14   Percent Losses from Boot Camp
     Figure 3.15   Cox and Stuart Test Results
     Figure 3.16   Percent Losses from A-School
     Figure 3.17   Cox and Stuart Test Results

AX   Figure 3.18   Percent Losses from Boot Camp
     Figure 3.19   Cox and Stuart Test Results
     Figure 3.20   Percent Losses from A-School
     Figure 3.21   Cox and Stuart Test Results
Figures 3.10, 3.14, and 3.18 graphically display the percent losses from Boot Camp. Similarly, Figures 3.12, 3.16, and 3.20 graphically display the percent losses from A-school. Figures 3.11, 3.15, and 3.19 provide the Cox and Stuart test results for data sets pertaining to Boot Camp. Similarly, Figures 3.13, 3.17, and 3.21 provide the Cox and Stuart test results for attrition losses in A-school. In all cases, we accepted the null hypothesis: attrition is not increasing.
2. Attrition rates: Is it rising? 


What is the attrition rate during basic training?
Is the attrition rate higher this year than last year?

These two questions form the basis of this subsection. Two models are presented. The first model is used to estimate the attrition rates. The second model determines if the rates are increasing.


a. Estimation of attrition rates 





MODEL: $N_{ij}(t) = n_{ij} e^{-\lambda_{ij} t} + \varepsilon_{ij}$

INDICES: $i$ = rating
$j$ = year group

$N_{ij}(t)$ = the number of survivors from group $(i,j)$ at time $t$
$n_{ij}$ = the number of individuals from rating $i$ and year group $j$
$e^{-\lambda_{ij} t}$ = the probability an individual from group $(i,j)$ survives to time $t$
$\lambda_{ij}$ = attrition rate for group $(i,j)$
$t$ = time
$\varepsilon_{ij}$ = error terms that are iid $N(0,\sigma^2)$
Figure 3.10 AT: Percent losses from Boot Camp.

$H_0$: Attrition is not increasing.
$H_1$: Attrition is increasing.
Since $T^+$ falls in the acceptance region, we accept $H_0$. Attrition is not increasing.

Figure 3.11 AT: Cox and Stuart Test Results #1.

Figure 3.12 AT: Percent losses from A-school.

$H_0$: Attrition is not increasing.
$H_1$: Attrition is increasing.
Since $T^+$ falls in the acceptance region, we accept $H_0$. Attrition is not increasing.

Figure 3.13 AT: Cox and Stuart Test Results #2.

Figure 3.14 AW: Percent losses from Boot Camp.

$H_0$: Attrition is not increasing.
$H_1$: Attrition is increasing.
Since $T^+$ falls in the acceptance region, we accept $H_0$. Attrition is not increasing.

Figure 3.15 AW: Cox and Stuart Test Results #1.

Figure 3.16 AW: Percent losses from A-school.

$H_0$: Attrition is not increasing.
$H_1$: Attrition is increasing.
Since $T^+$ falls in the acceptance region, we accept $H_0$. Attrition is not increasing.

Figure 3.17 AW: Cox and Stuart Test Results #2.

Figure 3.18 AX: Percent losses from Boot Camp.

$H_0$: Attrition is not increasing.
$H_1$: Attrition is increasing.
Since $T^+$ falls in the acceptance region, we accept $H_0$. Attrition is not increasing.

Figure 3.19 AX: Cox and Stuart Test Results #1.

[Plot for AX selectees: percent loss from A-school (vertical axis) by year group (horizontal axis), with a table of starters, losses, survivors, and percent loss for year groups 81 through 84. Only the year group 81 column is legible: 45 starters, 7 losses, 38 survivors.]

Figure 3.20 AX: Percent losses from A-school.

$H_0$: Attrition is not increasing.
$H_1$: Attrition is increasing.
Since $T^+$ falls in the acceptance region, we accept $H_0$. Attrition is not increasing.

Figure 3.21 AX: Cox and Stuart Test Results #2.


This is a simple nonlinear model with one parameter ($\lambda_{ij}$) to be estimated
per cell. Embedded within the model are a couple of things worth mentioning. The
$e^{-\lambda_{ij} t}$ term represents the probability that an individual from group (i,j) remains in the Navy
until time t. This is the exponential survival function. Let $T_{ij}$ be the random variable
whose probability distribution has survival function $e^{-\lambda_{ij} t}$. Due to the
uniqueness of survival functions, $T_{ij} \sim \mathrm{EXP}(\lambda_{ij})$. Hence, the time spent in
training is exponentially distributed. The next term to look at is $n_{ij} e^{-\lambda_{ij} t}$. Here $n_{ij}$
represents the number of individuals from rating i and year group j, and $e^{-\lambda_{ij} t}$ is the
probability that an individual from group (i,j) survives until time t. So, $n_{ij} e^{-\lambda_{ij} t}$ is nothing
more than the expected value of a Binomial random variable with parameters
$(n, p) = (n_{ij}, e^{-\lambda_{ij} t})$. Now let's look at the model in its entirety,
$[\, N_{ij} = n_{ij} e^{-\lambda_{ij} t} + \varepsilon_{ij} \,]$. The observed $N_{ij}$ can be thought of as a systematic term plus
some noise ($\varepsilon_{ij}$). The systematic term is the expected value of a binomial distribution.
It represents the expected number of survivors at time t.

Our goal is to estimate $\lambda_{ij}$. We used the NLIN procedure in SAS to
estimate the parameter $\lambda$ for each group. See Appendix C for a copy of the SAS
program and the data vectors used by the program. Table X provides the results.
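
As a rough illustration of what such a one-parameter nonlinear least squares fit involves, the sketch below estimates the decay rate of a single cell by golden-section search. It does not reproduce the SAS NLIN procedure; the function names, the search interval, and the survivor counts are assumptions chosen only to make the example self-contained.

```python
import math


def sse(lam, n0, months, survivors):
    """Sum of squared errors for the model N(t) = n0 * exp(-lam * t)."""
    return sum((obs - n0 * math.exp(-lam * t)) ** 2
               for t, obs in zip(months, survivors))


def fit_lambda(n0, months, survivors, lo=0.0, hi=1.0, tol=1e-10):
    """Least squares estimate of lam by golden-section search on [lo, hi]."""
    ratio = (math.sqrt(5) - 1) / 2
    a, b = lo, hi
    while b - a > tol:
        c = b - ratio * (b - a)
        d = a + ratio * (b - a)
        if sse(c, n0, months, survivors) < sse(d, n0, months, survivors):
            b = d  # the minimum lies in [a, d]
        else:
            a = c  # the minimum lies in [c, b]
    return (a + b) / 2


# Hypothetical cell: 200 entrants, survivors counted every 6 months.
n0 = 200
months = [6, 12, 18, 24]
survivors = [n0 * math.exp(-0.01 * t) for t in months]  # noise-free, lam = 0.01
lam_hat = fit_lambda(n0, months, survivors)
print(round(lam_hat, 6))
```

The search assumes the error surface has a single minimum on the interval, which holds for the noise-free data used here.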


b. Are attrition rates increasing? 





MODEL: $Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i$


INDEX: i = year group


$Y_i$ = attrition rate for year group i

$\beta_0$ = constant attrition rate for all year groups

$\beta_1$ = change in Y due to a one unit change in X (slope)
$X_i$ = year group i

$\varepsilon_i$ = error terms that are iid $N(0, \sigma^2)$
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XXXX = estimate of his 


(.YYYY) = asymptotic standard error of the estimate





Recall event B defined in our problem statement: Attrition is increasing.
We will use the linear regression model $[\, Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i \,]$ to ascertain the
validity of the statement. The linear regression model permits us to statistically verify
event B. We will test the regression coefficient $\beta_1$. If $\beta_1$ is statistically greater than
zero, then we will conclude: “Attrition rates are increasing.” Let us set up our
hypothesis test.

Test number one is the F-test. As stated in Draper and Smith [Ref. 4: p.
32], the F-test will determine if a trend exists in the regression equation. The
hypothesis test and decision rule associated with this test are listed in Table XI.


TABLE XI
LINEAR REGRESSION F-TEST #1


$H_0: \beta_1 = 0$ [Attrition rates are constant.]


$H_1: \beta_1 \neq 0$ [Attrition rates are not constant.]


If F* ≤ F(.95, 1, n-2) then conclude $H_0$
If F* > F(.95, 1, n-2) then conclude $H_1$





Test number two is the one sided t-test. This test is used after the F-test. If the F-test
determines that a trend exists, then this test will determine the direction of the trend
[Ref. 2: p. 68]. The hypothesis test and the decision rule associated with the one sided
t-test are listed in Table XII.


TABLE XII
LINEAR REGRESSION t-TEST #1


$H_0: \beta_1 \le 0$ [Attrition rates are not increasing.]


$H_1: \beta_1 > 0$ [Attrition rates are increasing.]


If t* ≤ t(.95, n-2) then conclude $H_0$
If t* > t(.95, n-2) then conclude $H_1$
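
The two decision rules can be chained as in the following sketch. It uses the slope estimate, standard error, and critical values reported for the AT rating in Figure 3.22, and relies on the fact that F* equals the square of t* in simple linear regression. The function name and return layout are illustrative assumptions.

```python
def regression_trend_decision(b1, se_b1, f_crit, t_crit):
    """Apply the F-test and the one sided t-test decision rules to a fitted slope."""
    t_star = b1 / se_b1
    f_star = t_star ** 2  # identity for simple linear regression
    trend_exists = f_star > f_crit  # F-test: conclude H1 when True
    # The t-test is only consulted after a significant F-test.
    increasing = trend_exists and t_star > t_crit
    return f_star, t_star, trend_exists, increasing


# Values as reported in Figure 3.22 (AT): b1, s(b1), F(.95,1,2), t(.95,2).
f_star, t_star, trend, rising = regression_trend_decision(
    b1=0.000669, se_b1=0.000457, f_crit=18.5, t_crit=2.92)
print(round(f_star, 3), round(t_star, 3), trend, rising)
```

Because F* is far below the critical value, the sketch stops at the F-test and reports no trend, so the direction test is never reached.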





We performed three tests. See Figures 3.22, 3.23, and 3.24 for the
results. The F-test results are listed in Table XIII. In all three cases, by our
decision rule, we accept $H_0$ and conclude: “Attrition rates are constant.” The one sided
t-test sequentially follows the F-test. Our results show that the F-test is not
statistically significant. In view of this fact, it is not necessary to perform the t-test.
However, details of the t-test are listed in Figures 3.22, 3.23, and 3.24. We summarize
the results of the one sided t-test by saying, “Attrition rates are not increasing.”


TABLE XIII
REGRESSION ON ATTRITION RATES: F-TEST RESULTS


Rating    F*    n    F(.95, 1, n-2)





C. SPECIALIZED TRAINING 

The third event of our problem statement is: The amount of specialized training
has increased. As previously discussed, we will measure the amount of specialized
training by counting the number of NEC's an individual acquires. The
measurement will take place during the individual's second and third year of service.
Two methods are presented. In the first, given a year group, we looked at the average number of
NEC's per individual. We plugged these numbers into a regression model and tested
this sequence to determine if an increasing trend existed. Method number two used a
random sample of individuals from each year group. A balanced design ANOVA
model determined if the average number of NEC's per year group differed. The
ensuing analysis excludes year group 84 because the data base does not cover their
third year of service.
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Kean Grou 







Source        df   SS         MS         F*       PR > F*
Regression    1    0.000002   0.000002   2.1440   0.2807
Error         2    0.000002   0.000001
Total         3    0.000004

R²       C.V.       √MSE       Ȳ
0.5174   18.92083   0.001022   0.005401

β_i   df   b_i        s(b_i)     t*       Pr > t*
β0    1    0.003728   0.001252
β1    1    0.000669   0.000457   1.4640   0.1403

β_i   CI_lb      b_i        CI_ub
β0    0.000072   0.003728   0.007384
β1    -.000665   0.000669   0.002003

F* = MSR/MSE   F(.95,1,2) = 18.5   t* = b1/s(b1)   t(.95,2) = 2.92

CI: b_i ± t(.95,2)s(b_i)


Figure 3.22 AT: Attrition rates - Regression results.
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AW: losses per month 








Year Group


Source        df   SS         MS         F*            PR > F*
Regression    1    0.000020   0.000020   [illegible]   0.1021
Error         6    0.000032   0.000005
Total         7    0.000052

R²       C.V.       √MSE          Ȳ
0.3826   18.83448   [illegible]   0.012718

β_i   df   b_i        s(b_i)     t*       Pr > t*
β0    1    0.009107   0.001787
β1    1    0.000683   0.000354   1.9280   0.0511

β_i   CI_lb      b_i        CI_ub
β0    0.004729   0.009107   0.013484
β1    -.000184   0.000683   0.001549

F* = MSR/MSE   F(.95,1,6) = 5.99   t* = b1/s(b1)   t(.95,6) = 1.94

CI: b_i ± t(.95,6)s(b_i)


Figure 3.23 AW: Attrition rates - Regression results.


AX: losses per month

Year Group


Source        df   SS         MS         F*       PR > F*
Regression    1    0.000204   0.000204   1.9954   0.2935
Error         2    0.000204   0.000102
Total         3    0.000408

R²       C.V.          √MSE       Ȳ
0.4994   [illegible]   0.010110   0.019487

β_i   df   b_i        s(b_i)     t*       Pr > t*
β0    1    0.003520   0.012382
β1    1    0.006387   0.004521   1.4126   0.1466

β_i   CI_lb      b_i        CI_ub
β0    -.049737   0.003520   0.056757
β1    -.013067   0.006387   0.025841

F* = MSR/MSE   F(.95,1,2) = 18.5   t* = b1/s(b1)   t(.95,2) = 2.92

CI: b_i ± t(.95,2)s(b_i)


Figure 3.24 AX: Attrition rates - Regression results.
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]. Average number of NEC’s per individual: Has it increased? 


TABLE XIV
AVERAGE NUMBER OF NEC'S PER INDIVIDUAL


Rating   Year Group   NEC_i   N_i   AVG
AT       81           369     232   1.5905
         82           1010    524   1.9275
         83           619     365   1.6959
AW       77           114     70    1.6286
         78           154     102   1.5098
         79           349     165   2.1152
         80           422     213   1.9812
         81           352     177   1.9887
         82           668     304   2.1974
         83           444     243   1.8272
AX       81           58      33    1.7576
         82           255     139   1.8345
         83           133     79    1.6835


For each rating and year group, Table XIV lists the average number of NEC's
per individual. This number is ($NEC_i / N_i$) where:


$NEC_i$ = number of NEC's acquired by year group i
$N_i$ = number of individuals in year group i


We will set up the regression model and statistically test these table values for an
upward trend. The model is hereby defined.
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INDEX: 1 = year group 


$Y_i$ = average number of NEC's per individual from year group i
$\beta_0$ = constant number of NEC's per individual
$\beta_1$ = change in Y per unit change in X (slope)


$X_i$ = year group i
$\varepsilon_i$ = error terms that are iid $N(0, \sigma^2)$


The same methodology presented in the previous section will be used. The
F-test will determine if a trend exists and the one sided t-test will ascertain the direction
of the trend. The hypothesis tests and decision rules are presented in Tables XV and
XVI.

See Figures 3.25, 3.26, and 3.27. The test results clearly show that a trend is 
absent. The F-test forces us to accept the null hypothesis in all three cases. Likewise, 
the t-test directs us to accept the null hypothesis. We conclude this subsection by 


saying: “The average number of NEC's per individual is not increasing.” 
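
For readers who want to retrace one of these fits, the sketch below runs an ordinary least squares regression on the AT averages computed from the counts in Table XIV. Coding the year groups 81 to 83 as X = 1, 2, 3 is an assumption (it is consistent with the intercept reported in Figure 3.25), and the function name is illustrative.

```python
def simple_regression(x, y):
    """Ordinary least squares fit of y = b0 + b1*x with F* and t* for the slope."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    ssr = b1 ** 2 * sxx  # regression sum of squares
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    mse = sse / (n - 2)
    f_star = ssr / mse
    t_star = b1 / (mse / sxx) ** 0.5
    return b0, b1, f_star, t_star


# AT rating, year groups 81-83 from Table XIV: (NEC's acquired, individuals).
counts = [(369, 232), (1010, 524), (619, 365)]
averages = [round(nec / n, 4) for nec, n in counts]
b0, b1, f_star, t_star = simple_regression([1, 2, 3], averages)
print(round(b0, 4), round(b1, 4), round(f_star, 3), round(t_star, 3))
```

With only three year groups there is a single error degree of freedom, so the critical values F(.95,1,1) = 161 and t(.95,1) = 6.31 are very large and the test has little power to detect a trend.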


2. Average number of NEC’s per year group: Has it increased? 

The first method for determining the amount of specialized training condensed 
our data base into a few observations. We all know that a small sample size does not 
provide a powerful statistical result. The second method uses the single factor 
ANOVA model. We wanted to increase the number of observations in the test and use 
a balanced design. We took a random sample of 30 data points from each year group


and tested the sample means for statistical differences. We present the model. 
TABLE XV
LINEAR REGRESSION F-TEST #2


$H_0: \beta_1 = 0$
$H_1: \beta_1 \neq 0$


$H_0$: The average number of NEC's per individual is constant.


$H_1$: The average number of NEC's per individual is not constant.


If F* ≤ F(.95, 1, n-2) then conclude $H_0$
If F* > F(.95, 1, n-2) then conclude $H_1$


TABLE XVI
LINEAR REGRESSION t-TEST #2


$H_0: \beta_1 \le 0$
$H_1: \beta_1 > 0$


$H_0$: The average number of NEC's per individual is not rising.


$H_1$: The average number of NEC's per individual is rising.


If t* ≤ t(.95, n-2) then conclude $H_0$
If t* > t(.95, n-2) then conclude $H_1$
AT: NEC's per person

Year Group


Source        df   SS         MS         F*       PR > F*
Regression    1    0.005555   0.005555   0.1030   0.8022
Error         1    0.053884   0.053884
Total         2    0.059439

R²       C.V.       √MSE       Ȳ
0.0935   13.35641   0.232130   1.737967

β_i   df   b_i        s(b_i)     t*       Pr > t*
β0    1    1.632600   0.354580
β1    1    0.052700   0.164141   0.3210   0.4011

β_i   CI_lb      b_i        CI_ub
β0    -2.81300   1.632600   6.078200
β1    -2.00500   0.052700   2.110600

F* = MSR/MSE   F(.95,1,1) = 161   t* = b1/s(b1)   t(.95,1) = 6.31

CI: b_i ± t(.95,1)s(b_i)


Figure 3.25 AT: NEC's per individual - Regression results.
AW: NEC's per person

Year Group


Source        df   SS         MS         F*      PR > F*
Regression    1    0.121506   0.121506   2.350   0.1859
Error         5    0.258542   0.051708
Total         6    0.380048

R²       C.V.       √MSE       Ȳ
0.3197   12.01502   0.227395   1.892586

β_i   df   b_i        s(b_i)     t*       Pr > t*
β0    1    1.629086   0.192184
β1    1    0.065875   0.042974   1.5330   0.0929

β_i   CI_lb      b_i        CI_ub
β0    1.135100   1.629086   2.123100
β1    -.044590   0.065875   0.176340

F* = MSR/MSE   F(.95,1,5) = 6.61   t* = b1/s(b1)   t(.95,5) = 2.02

CI: b_i ± t(.95,5)s(b_i)


Figure 3.26 AW: NEC's per individual - Regression results.
AX: NEC's per person

Year Group


Source        df   SS         MS         F*            PR > F*
Regression    1    0.002745   0.002745   [illegible]   [illegible]
Error         1    0.008656   0.008656
Total         2    0.011402

R²       C.V.      √MSE       Ȳ
0.2408   5.29076   0.093040   1.758533

β_i   df   b_i        s(b_i)     t*       Pr > t*
β0    1    1.832633   0.142121
β1    1    -.037050   0.065789   -.5630   0.6634

β_i   CI_lb      b_i        CI_ub
β0    0.050795   1.832633   3.614500
β1    -.861880   -.037050   0.787780

F* = MSR/MSE   F(.95,1,1) = 161   t* = b1/s(b1)   t(.95,1) = 6.31

CI: b_i ± t(.95,1)s(b_i)


Figure 3.27 AX: NEC's per individual - Regression results.
MODEL: $Y_{ij} = \mu + \tau_i + \varepsilon_{ij}$


INDICES: i = year group
         j = jth individual from cell i (j = 1, ..., 30)
$Y_{ij}$ = number of NEC's acquired by the jth individual from cell i
$\mu$ = average number of NEC's per individual
$\tau_i$ = additional number of NEC's an individual from year group i receives
$\varepsilon_{ij}$ = error terms that are iid $N(0, \sigma^2)$


We will follow the same outline presented earlier when we used the single 
factor ANOVA model to analyze the length of basic training. Our objectives for this 
section are: 

• Estimate the mean number of NEC's per year group.

• Statistically test the means for differences.

• Rank the means using a paired comparison test.
The ANOVA model and the Kruskal-Wallis (KW) test will determine if the means
differ. Tukey's paired comparison test will rank the means. The hypothesis tests
associated with the Analysis of Variance model and the Kruskal-Wallis test are listed in
Table XVII. The decision rules are also listed in Table XVII.

Test results, tables, and figures that support this subsection are grouped 


together. They are laid out in the following manner. 
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TABLE XVII 
SINGLE FACTOR ANOVA HYPOTHESIS TEST #2


$H_0$: The mean number of NEC's per year group has remained constant.


$H_1$: Not all the means are equal.


-ANOVA-
If F* ≤ F(.95, v1, v2) then conclude $H_0$
If F* > F(.95, v1, v2) then conclude $H_1$


-KW-
If $\chi^2_{KW} \le \chi^2(.95, v)$ then conclude $H_0$
If $\chi^2_{KW} > \chi^2(.95, v)$ then conclude $H_1$
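
A minimal sketch of the Kruskal-Wallis statistic follows, using the standard rank-sum form with the usual correction for ties (NEC counts are small integers, so ties are common). The function name and the three samples are hypothetical and are not drawn from the thesis data.

```python
def kruskal_wallis_h(groups):
    """Kruskal-Wallis H statistic with the standard correction for ties."""
    pooled = sorted(v for g in groups for v in g)
    n_total = len(pooled)
    rank_of = {}
    tie_term = 0
    i = 0
    while i < n_total:
        j = i
        while j < n_total and pooled[j] == pooled[i]:
            j += 1
        rank_of[pooled[i]] = (i + 1 + j) / 2  # mean of ranks i+1 .. j
        t = j - i
        tie_term += t ** 3 - t
        i = j
    h = 12 / (n_total * (n_total + 1)) * sum(
        sum(rank_of[v] for v in g) ** 2 / len(g) for g in groups
    ) - 3 * (n_total + 1)
    correction = 1 - tie_term / (n_total ** 3 - n_total)
    if correction == 0:  # every observation identical: no evidence of a difference
        return 0.0
    return h / correction


# Hypothetical NEC counts for three year groups (not the thesis samples).
samples = [[1, 2, 2, 1, 3], [2, 2, 3, 1, 2], [1, 1, 2, 2, 1]]
h_stat = kruskal_wallis_h(samples)
print(round(h_stat, 4), "conclude H0" if h_stat <= 5.99 else "conclude H1")
```

The statistic is compared with the chi-square critical value on (number of groups - 1) degrees of freedom, here 5.99 for v = 2.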





AT    Figure 3.28    Data Analysis Graphs
      Table XVIII    ANOVA/KW test results
      Figure 3.29    Tukey's paired comparison test results


AW    Figure 3.30    Data Analysis Graphs
      Table XIX      ANOVA/KW test results
      Figure 3.31    Tukey's paired comparison test results


AX    Figure 3.32    Data Analysis Graphs
      Table XX       ANOVA/KW test results
      Figure 3.33    Tukey's paired comparison test results


Figures 3.28, 3.30, and 3.32 provide a graphical summary of the data sets. Tables
XVIII, XIX, and XX provide the ANOVA test results and the Kruskal-Wallis test
results. Figures 3.29, 3.31, and 3.33 provide Tukey's paired comparison test results.
These figures display a graphical ranking of the means. Specific results are listed in the
figures and tables. We summarize the findings.

• AT rating: (F* < F) and ($\chi^2_{KW} < \chi^2$). By our decision rule, we accept $H_0$
and conclude, “The mean number of NEC's acquired per year group has remained
constant.”

• AW rating: The P value is .001. The test results are statistically significant. The
elements of the $\tau$ vector are not equal. Using our decision rule, we accept the
alternative hypothesis. Figure 3.31 provides a closer look at the differences. All
means are grouped together under category A except year group 78. Those
grouped together are not statistically different. Year group 78 does not belong
to group A, but look at the numbers. In particular, look at the largest mean
(2.1) and look at the smallest mean (1.3). The difference is statistically
significant but not operationally significant.* We conclude by saying: “A change
occurred but it is not operationally significant enough to influence training costs.”

• AX rating: (F* < F) and ($\chi^2_{KW} < \chi^2$). By our decision rule, we accept $H_0$
and draw the same conclusion stated for the AT rating: no increase.


*We defined operationally significant as a factor of two or more. For first term
enlistees, increasing the number of NEC's up to a factor of two should have little effect
on training costs. The Navy's C-schools should have the capacity to train more first
term enlistees. However, (2 * 1.3) = 2.6, which is fairly close to 2.1. There is a
possibility that this change has more importance than we've given it.
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| 
[Plot: AT rating. Number of NEC's (vertical axis) by year group (horizontal axis).]


Figure 3.28 AT: NEC's per year group.
TABLE XVIII


AT: NEC'S PER YEAR GROUP
ANOVA TEST RESULTS

                                          -PERCENTILES-
 i    n_i   %       μ̂+τ̂_i   σ̂_i      0.25   0.50   0.75
 81   30    33.3    1.700    0.6513   1      2      2
 82   30    33.3    1.933    0.7397   1      2      2
 83   30    33.3    1.567    0.6261   1      1      2
      90    100.0   1.733    0.6837   1      2      2

CLASS   LEVELS   VALUES
τ       3        81 82 83

Source   df   SS        MS       F*     PR > F*
Model    2    2.0667    1.0333   2.27   0.1089
Error    87   39.5333   0.4544
Total    89   41.6000

R²       C.V.      √MSE     Ȳ
0.0497   38.8902   0.6740   1.7333


KRUSKAL-WALLIS NONPARAMETRIC TEST FOR EQUAL MEANS
df   χ²_KW    PR > χ²_KW
2    4.1309   0.1268


F(.95,2,87) = 3.11   χ²(.95,2) = 5.99
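
Because the design is balanced, the ANOVA sums of squares can be recovered from the group means and standard deviations alone. The sketch below does this for the AT rating using the three-decimal means and the standard deviations listed in Table XVIII; the function name is an illustrative assumption, and small differences from the tabled sums of squares are due to rounding of the inputs.

```python
def balanced_anova_f(means, std_devs, n_per_group):
    """Single factor ANOVA F* from group means and standard deviations (equal n)."""
    k = len(means)
    grand_mean = sum(means) / k
    ss_model = n_per_group * sum((m - grand_mean) ** 2 for m in means)
    ss_error = (n_per_group - 1) * sum(s ** 2 for s in std_devs)
    ms_model = ss_model / (k - 1)
    ms_error = ss_error / (k * (n_per_group - 1))
    return ss_model, ss_error, ms_model / ms_error


# AT rating, year groups 81-83: means and standard deviations from Table XVIII.
ss_model, ss_error, f_star = balanced_anova_f(
    means=[1.700, 1.933, 1.567],
    std_devs=[0.6513, 0.7397, 0.6261],
    n_per_group=30,
)
print(round(ss_model, 2), round(ss_error, 2), round(f_star, 2))
print("conclude H0" if f_star <= 3.11 else "conclude H1")  # F(.95,2,87) = 3.11
```

The computed F* falls below the critical value of 3.11, which agrees with the conclusion drawn for the AT rating.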


AT RATING


 i    μ̂+τ̂_i
 81   1.7000
 82   1.9333
 83   1.5667


Means boxed together are not statistically different.


Figure 3.29 AT: Tukey's paired comparison test results #2.
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AW RATING 





YEAR GROUP


Figure 3.30 AW: NEC's per year group.


TABLE XIX


AW: NEC'S PER YEAR GROUP
ANOVA TEST RESULTS

                                          -PERCENTILES-
 i    n_i   %       μ̂+τ̂_i   σ̂_i      0.25   0.50   0.75
 77   30    14.3    1.567    0.5940   1      2      2
 78   30    14.3    1.300    0.6513   1      1      2
 79   30    14.3    1.800    0.9966   1      2      2
 80   30    14.3    1.867    0.8604   1      2      2
 81   30    14.3    1.867    0.6815   1      2      2
 82   30    14.3    2.100    0.6074   2      2      [illegible]
 83   30    14.3    1.633    0.7184   1      1      2
      210   100.0   1.733    0.7611   1      2      2

CLASS   LEVELS   VALUES
τ       7        77 78 79 80 81 82 83

Source   df    SS         MS       F*     PR > F*
Model    6     12.0000    2.0000   3.72   0.0016
Error    203   109.0667   0.5373
Total    209   121.0667

R²       C.V.      √MSE     Ȳ
0.0991   42.2879   0.7330   1.7333


KRUSKAL-WALLIS NONPARAMETRIC TEST FOR EQUAL MEANS
df   χ²_KW   PR > χ²_KW
6    21.65   0.0014


F(.95,6,203) = 2.10   χ²(.95,6) = 12.59
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AW RATING 





77   78   79   80   81   82   83


 i    n_i   μ̂+τ̂_i
 77   30    1.5667
 78   30    1.3000
 79   30    1.8000
 80   30    1.8667
 81   30    1.8667
 82   30    2.1000
 83   30    1.6333


Means boxed together are not statistically different.


Figure 3.31 AW: Tukey's paired comparison test results #2.
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Figure 3.32 


AX RATING 





YEAR GROUP 


AX: NEC’s per year group. 
TABLE XX


AX: NEC'S PER YEAR GROUP
ANOVA TEST RESULTS

                                          -PERCENTILES-
 i    n_i   %       μ̂+τ̂_i   σ̂_i      0.25   0.50   0.75
 81   30    33.3    1.767    0.7279   1      2      2
 82   30    33.3    1.633    0.5560   1      2      2
 83   30    33.3    1.400    0.6215   1      1      2
      90    100.0   1.600    0.6500   1      2      2

CLASS   LEVELS   VALUES
τ       3        81 82 83

Source   df   SS        MS       F*     PR > F*
Model    2    2.0667    1.0333   2.53   0.0855
Error    87   35.5333   0.4084
Total    89   37.6000

R²       C.V.      √MSE     Ȳ
0.0550   39.9428   0.6391   1.6000


KRUSKAL-WALLIS NONPARAMETRIC TEST FOR EQUAL MEANS
df   χ²_KW   PR > χ²_KW
2    4.26    0.1186


F(.95,2,1109) = 3.00   χ²(.95,2) = 5.99


AX RATING


 i    n_i   μ̂+τ̂_i
 81   30    1.7667
 82   30    1.6333
 83   30    1.4000


Means boxed together are not statistically different.


Figure 3.33 AX: Tukey's paired comparison test results #2.


IV. RESULTS AND CONCLUSIONS


We started off with the following question, “What are the factors causing training
costs to rise?” To understand the problem, we formulated several reasons why we
think training costs are rising. Those reasons are:

• The length of basic training has increased.

• Attrition has increased.

• The amount of specialized training has increased.

We set out to verify those reasons using some historical data compiled by CNA.
The scope of this study is limited. The results are valid within the following
confines.

• Inferences are made with respect to these enlisted ratings: AT, AW, and AX.

• The expected career path is Boot Camp → A-School → Fleet. Inferences are
further restricted to those individuals that followed the expected career path.

• The overall time frame is restricted to the first enlistment period.

• The first 24 months is the time constraint for two areas of study, Basic Training
and Attrition.

• The second and third years of service is the time constraint for the last area of
study, Specialized Training.


A. SUMMARY

• (Length of Basic Training: not a factor) The length of basic training has
cycled up and down. It has fluctuated over the years, but there is no evidence
to suggest a steady increase over the years. Figures 4.1, 4.2, and 4.3 provide
graphical summaries. In all three cases, the final trend is encouraging: the
length of basic training has decreased.

• (Attrition: not a factor) Losses in Basic Training are roughly constant from
year to year. Attrition has not increased.

• (Amount of specialized training: not a factor) Specialized training has
remained constant. The amount has not increased.
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MONTHS 
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Figure 4.1 AT: Length of basic training. 
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Figure 4.2 AW: Length of basic training. 
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Figure 4.3 AX: Length of basic training. 
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B. RECOMMENDATIONS 
This study looked at a small piece of the problem. The final result is that we
were unable to identify any factors causing training costs to rise. However, here is a
list of general questions that may be of interest for further research.
1. Has the length of basic training increased for enlisted ratings other than AT,
AW, and AX?
2. Has the amount of specialized training increased after the first enlistment
period?
3. Is the selection process effective?*
4. Have the Training Command's support costs increased?
5. Are training costs rising due to increased or improved training resources?
This list is by no means exhaustive. These are a few questions that we can ask but were
unable to answer in this study.


*The selection process is primarily based upon test scores and education level. If
the selection process is effective, then people screened for a particular rating will
complete that training program. The attrition rate will be low and survivability high.
However, if we do not screen people properly, the number of people that complete the
program will be much less than optimal. Attrition will be high. The effect is higher
training costs. An effective selection process produces savings.
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APPENDIX A 
MODEL ASSUMPTIONS 


Throughout this study, we used two models extensively, the regression
model and the ANOVA model. Both models helped us to conceptualize the problem
and analyze the observations. The purpose of both models is to describe the events of
the past. These models are also used to predict and control events, but we were not
interested in using them for those matters.

In this appendix, we will briefly assess the aptness of the models. Is the model
appropriate for the data set at hand? This is an important question. It should be
answered whenever models are used. The importance of aptness is best described by
logic's implication statement, if P then Q (P → Q). If the model is appropriate, then
the ensuing analysis presented by the model is correct. Good analysis is conditioned
on the analyst using the appropriate models. The appropriateness of a
model depends upon adherence to the assumptions embedded within the model.

We emphasized the importance of examining the aptness of a model, but how do
we confirm that a model is appropriate? Residual analysis is the tool for this task. It
is highly effective for spotting major departures from the assumed model. Our goal is
to verify the model assumptions by using residual analysis. In the statistical world, this
verification follows the mentality used in the U.S. court system, where we assume the
defendant to be innocent unless guilt is proven beyond a reasonable doubt.
In our profession, we assume the model assumptions are correct unless the data prove otherwise.
The major purpose of residual analysis is to detect serious departures from the
conditions assumed by the model.

Strict adherence to every assumption is not possible with this data set. A few
departures exist; however, the departures are not substantial. Our first discussion
centers around the regression model. The second part deals with the ANOVA model.
Assumptions are listed for each model. This is followed by a short summary discussing
the verification procedures and any effects caused by a departure from the model.
Figures and tables pertain to the AW rating. Similar results were obtained for the AT
and AX ratings.
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l. REGRESSION 

We used graphical means to confirm the assumptions embedded within the
regression model. (See Table XXI.) The assumptions are listed in column one. The
plots used to confirm these assumptions are listed in column two. Our goal is to
ensure the assumptions are plausible in light of the data.


TABLE XXI
REGRESSION MODEL ASSUMPTIONS


Assumption                                      Verification

1. The relationship is linear.                  Scatter Plot

2. The error terms are independent.             RESID vs X
                                                RESID vs YHAT

3. The error terms have constant variance.      RESID vs X
                                                RESID vs YHAT

4. The error terms are normally distributed.    Q-Q Plot


a. The relationship is linear.

Whether or not a linear regression function is appropriate for the data set
being analyzed can often be studied with a scatter plot of the data. (See Figure
A.1.) These scatter plots are an effective means to examine the appropriateness of the
linear regression function. Notice that these plots do not exhibit any departures from
the model.


b. The errors are independent and have constant variance.
If the model correctly describes the observations, the (RESID vs X) plot and
the (RESID vs YHAT) plot should display a pattern that is uniformly distributed within
a horizontal band centered at zero. (See Figures A.2 and A.3.) The plots portray the
prescribed behavior. No trends are present.


c. The error terms are normal.
The residuals should resemble observations taken from a normal distribution.
The Q-Q plots are used to confirm this. Figure A.4 displays these plots. The residuals appear
to be normally distributed.
In summary, no serious departures from the assumptions were noted. The
linear regression model is appropriate for the data set at hand.


2. ANALYSIS OF VARIANCE 
The assumptions embedded within the ANOVA model are similar to those of the
regression model. See Table XXII for a list of the assumptions and the verification
method.


TABLE XXII
ANOVA MODEL ASSUMPTIONS


Assumption                                      Verification

The populations are normally distributed.

The population variances are equal.             Bartlett Test
                                                Hartley Test

The error terms are independent.                Durbin-Watson Test

The error terms have constant variance.         RESID vs X
                                                RESID vs YHAT

The error terms are normally distributed.       Histogram


a. The populations are normally distributed.
The first assumption requires the populations to be normally distributed.
Formal verification will not be presented here. It will suffice to say that upon
examination of the data sets, we found most of the populations to lack normality.
Herein lies the first departure from the model, but the departure is not large. Lack of
normality is not an important matter provided the departure from normality is not of
extreme form. The point estimators of factor level means and contrasts are unbiased
whether or not the populations are normal. The F-test for equality of means is but
little affected by lack of normality, either in terms of level of significance or power of
the test. Hence the F-test is a robust test against departures from normality. [Ref. 2:
p. 624]


b. The population variances are equal. 
The second assumption requires equal variances. We used the Bartlett test or
the Hartley test to verify homogeneity of variance. Let's discuss where we applied each
test.
1. Basic Training: Bartlett Test - (unequal sample sizes)
The idea underlying Bartlett's test* is simple. By definition:
$MSE = \frac{1}{df_T} \sum_{i} df_i \, s_i^2$ (eqn A.1)
$GMSE = \left[ (s_1^2)^{df_1} \times \cdots \times (s_r^2)^{df_r} \right]^{1/df_T}$ (eqn A.2)


The relationship between the arithmetic mean and the geometric mean is:
$GMSE \le MSE$ (eqn A.3)


The two averages will be equal if $s_i^2 = s_j^2$ for all i and j. Hence, if the ratio (MSE/GMSE) is close to
one, we have evidence the variances are equal. If the ratio is large, it indicates that the
population variances are unequal. Bartlett's test statistic is computed as follows:


$\chi^2_B = \frac{df_T}{C} \left( \log_e MSE - \log_e GMSE \right)$ (eqn A.4)
where:
$C = 1 + \frac{1}{3(r-1)} \left[ \sum_{i} \frac{1}{df_i} - \frac{1}{df_T} \right]$ (eqn A.5)


*[Ref. 2: Sec. 18.6] provides a detailed discussion of this test.
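
Equations A.1 through A.5 translate directly into a few lines of code. The sketch below works with logarithms to avoid overflow when forming the geometric mean; the function name and the sample variances are hypothetical, with r taken to be the number of populations and df_T the sum of the individual degrees of freedom.

```python
import math


def bartlett_statistic(variances, dfs):
    """Bartlett's test statistic from sample variances and their degrees of freedom."""
    r = len(variances)
    df_total = sum(dfs)
    # Arithmetic (df-weighted) mean of the variances, eqn A.1.
    mse = sum(df * s2 for df, s2 in zip(dfs, variances)) / df_total
    # Log of the df-weighted geometric mean, eqn A.2.
    log_gmse = sum(df * math.log(s2) for df, s2 in zip(dfs, variances)) / df_total
    # Correction factor, eqn A.5.
    c = 1 + (sum(1 / df for df in dfs) - 1 / df_total) / (3 * (r - 1))
    return (df_total / c) * (math.log(mse) - log_gmse)


# Hypothetical sample variances with unequal sample sizes (not the thesis data).
variances = [0.42, 0.55, 0.39]
dfs = [24, 40, 31]
b_stat = bartlett_statistic(variances, dfs)
print(round(b_stat, 4), "conclude H0" if b_stat <= 5.99 else "conclude H1")
```

The statistic is zero when all sample variances coincide and grows as they spread apart; it is compared with the chi-square critical value on r - 1 degrees of freedom.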
The population variances are listed in Table XXIII. We statistically tested
these values to determine if they were equal. The hypothesis test and decision rule
associated with Bartlett's test are listed in Table XXIII. The results are also listed in
Table XXIII.

With respect to the AW and AX ratings, we accept the null hypothesis and
conclude that the population variances are equal. However, we cannot say the same for
the AT rating. Departure from this model assumption has some effect. How sensitive
is the model with respect to this departure?

When the error variances are unequal, the F-test for equality of means is
only slightly affected if all factor level sample sizes are equal or do not differ greatly.
Specifically, unequal error variances raise the actual level of significance only slightly
higher than the specified level. The F-test is robust against unequal variances when the
sample sizes are approximately equal. [Ref. 2: p. 624]

Let's look at this aspect more closely. For the AT rating, the population
variances are unequal and the sample sizes are unequal. We expect the significance
level to be inflated. However, even if a large inflation factor existed, it would not have
affected this ANOVA test very much. This is due to the fact that the test results were
significant at the .0001 level! The difference in means is causing the significance level
to be extremely small. It is overpowering any inflationary effect caused by unequal
variances. The actual probability that the means are equal is somewhat less than
.0001. In summary, a departure from the model is present: the population variances
are not equal. However, this does not bias the true results very much. In this case, we
accept the validity of the F-test results.


2. Specialized Training: Hartley Test - (equal sample sizes)
For equal sample sizes, Hartley's test* for equality of variance is based
solely on the largest sample variance and the smallest sample variance. Hartley's test
statistic is defined as follows:


$H^* = \max(s_i^2) / \min(s_i^2)$ (eqn A.6)


*[Ref. 2: Sec. 18.6] provides a detailed discussion of this test.


Clearly, values of H* near one support the claim that the population variances are
equal. The variances for each population are listed in Table XXIV. The hypothesis
test and decision rule associated with the Hartley test are listed in Table XXIV. The
results are also listed in Table XXIV. For all three test cases, we conclude the
population variances are equal.
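
Equation A.6 can be checked by hand or with a short script. The sketch below applies it to the AW standard deviations for year groups 77 through 83 as listed in Table XIX; the function name is an illustrative assumption, and the critical value 3.02 is the one shown for the seven-group case in Table XXIV.

```python
def hartley_h(std_devs):
    """Hartley's H*: ratio of the largest to the smallest sample variance."""
    variances = [s ** 2 for s in std_devs]
    return max(variances) / min(variances)


# AW rating standard deviations for year groups 77-83, from Table XIX.
aw_sd = [0.5940, 0.6513, 0.9966, 0.8604, 0.6815, 0.6074, 0.7184]
h_star = hartley_h(aw_sd)
print(round(h_star, 4), "conclude H0" if h_star <= 3.02 else "conclude H1")
```

Only the two extreme variances enter the statistic, which is why the test is restricted to designs with equal sample sizes.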


c. The error terms are independent. 

The third assumption requires the error terms to be independent. Lack of
independence can have serious effects on the inferences made using the ANOVA
output. The observations were obtained in time sequence, so there is a good chance
the error terms are serially correlated or autocorrelated.

The most popular test for first-order autoregressive errors is the
Durbin-Watson (D-W) test. It is a powerful test yet extremely easy to use. See [Ref. 5:
Sec. 15.3] for a detailed commentary on the (D-W) statistic. The original model
specifies the error terms ($\varepsilon_t$) to be independent and identically distributed $N(0, \sigma^2)$
random variables. The underlying argument for the D-W test is simple. Model the
error term as a first-order autoregressive process such that:


$\varepsilon_t = \rho \, \varepsilon_{t-1} + v_t$ (eqn A.7)
where:
$\rho$ = autocorrelation parameter such that $|\rho| < 1$
$v_t$ = disturbance terms that are iid $N(0, \sigma^2)$


Each error term includes a fraction of the previous error term plus a
disturbance term. If $\rho = 0$, then $\varepsilon_t = v_t$, and we're back to our original assumption
because the disturbance terms ($v_t$) are independent. The D-W test determines if $\rho = 0$.
The hypothesis test and decision rule associated with the D-W test are listed in Table
XXV. The Durbin-Watson test results are also listed in Table XXV. For every test
case, we conclude: “The autocorrelation parameter $\rho$ is zero; hence, the error terms are
independent.”
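
A minimal sketch of the Durbin-Watson statistic and the three-way decision rule follows. The residuals and the bounds d_lb and d_ub are hypothetical placeholders (real bounds come from published Durbin-Watson tables for the sample size at hand), and the function names are illustrative.

```python
def durbin_watson(residuals):
    """Durbin-Watson statistic: squared successive differences over squared residuals."""
    num = sum((residuals[t] - residuals[t - 1]) ** 2
              for t in range(1, len(residuals)))
    den = sum(e ** 2 for e in residuals)
    return num / den


def dw_decision(dw, d_lb, d_ub):
    """One sided test of H0: rho = 0 against H1: rho > 0."""
    if dw > d_ub:
        return "conclude H0"
    if dw < d_lb:
        return "conclude H1"
    return "inconclusive"


# Hypothetical time-ordered residuals and hypothetical bounds (not the thesis data).
resid = [0.4, -0.3, 0.1, -0.5, 0.6, -0.2, 0.3, -0.4]
dw = durbin_watson(resid)
print(round(dw, 3), dw_decision(dw, d_lb=1.2, d_ub=1.6))
```

Values of the statistic near 2 are consistent with uncorrelated errors, while values near 0 point to positive autocorrelation.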


d. The error terms have constant variance. 
Assumption number four requires the error terms to have constant variance.
(See Figures A.5 through A.8.) We plotted the residuals against the independent
variable and the fitted value. No discernible pattern emerged. The residuals lie within
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a horizontal band centered at zero. Notice how the variance stays constant through 
changes on the X-axis. This behavior is the expected behavior given the assumption is 


correct. These plots give us no reason to reject the fourth assumption. 


e. The error terms are normally distributed. 

The last assumption requires the error terms to be normally distributed. We
plotted the residuals in the form of a histogram. (See Figures A.9 and A.10.) Both
plots resemble a normal distribution with mean zero. These plots support the last
assumption.

In summary, the assumptions are reasonable. We have no reason to reject
them as incorrect. There are a few minor departures from the model, but due to the
robustness of the F-test, these departures did not affect the final results. We conclude:
The ANOVA model is appropriate for the data set at hand.
Figure A.1 AW Regression: Scatter Plot.
Figure A.2 AW Regression: RESID vs X Plot.
Figure A.3 AW Regression: RESID vs YHAT Plot. (Panels: AW: LOSSES PER MONTH; AW: NEC'S PER PERSON.)
Figure A.4 AW Regression: Q-Q Plot. (Panels: two probability plots of RESIDUAL.)


TABLE XXIII

[Table body not recoverable from the scan. Legible fragments: a decision rule that
compares a test statistic with the critical value $\chi^2(.95, v)$ (statistic below the
critical value: conclude $H_0$; statistic above it: conclude $H_1$); rows for ratings
AT, AW, and AX; and the values 6.2084, 4.7408, and 0.0598.]


TABLE XXIV

[Table body not recoverable from the scan. Legible fragments: a two-line decision rule
ending "then conclude $H_0$" and "then conclude $H_1$", with a critical value indexed
by two degrees-of-freedom parameters; rows for ratings AT, AW, and AX; and the values
2.4000, 3.0200, 2.4000, and 2.5149.]


TABLE XXV
DURBIN-WATSON TEST

$H_0$: $\rho = 0$
$H_1$: $\rho > 0$

If $DW > d_U$ then conclude $H_0$
If $DW < d_L$ then conclude $H_1$
If $d_L \le DW \le d_U$ then the test is inconclusive

Time to get rated
Single Factor ANOVA Model
DW*: 2.022, 2.004, 2.080 ($d_L$ and $d_U$ columns not legible)

NEC's per year group
Single Factor ANOVA Model
DW*, $d_L$, $d_U$: only the value 2.040 is legible




Figure A.5 AW ANOVA: Time to get rated - RESID vs X. (X-axis: Year Group.)


Figure A.6 AW ANOVA: NEC's per year group - RESID vs X. (X-axis: Year Group.)


Figure A.7 AW ANOVA: Time to get rated - RESID vs YHAT.


Figure A.8 AW ANOVA: NEC's per year group - RESID vs YHAT.


Figure A.9 AW ANOVA: Time to get rated - Histogram of residuals.


Figure A.10 AW ANOVA: NEC's per year group - Histogram of residuals.


APPENDIX B

DATA BASE

The data base used in this study is described below. Column one is the variable
list. Column two gives the location of the variable within the data base. Column three
is a description of the variable.


VARIABLES   POSITION   DESCRIPTION

RECNUM      001-009    Record number
PGMCODE     010-013    Program Code (SG)
RATING      014-016    Rate:
                         AT = Aviation Technician
                         AW = Aviation ASW Operator
                         AX = Aviation ASW Technician
AREA        017-017    Recruiting Area (1, 3, 4, 5, 7, 8)
SEX         018-018    1 = Male
                       0 = Female
CIVED       019-020    Civilian education:
                         This is the number of years of
                         education completed.
GEDC        021-021    Graduate education code:
                         1 = High school diploma graduate
                         2 = Probable graduate
                         3 = Graduate equivalence diploma
                         4 = Non-high school graduate
WAVCE       022-022    Waiver for civilian education:
                         1 = The recruit received a waiver for
                             entry into the rating program
                             desired due to lack of sufficient
                             education.
                         0 = otherwise.
AFQT        023-025    Armed Forces Qualification Test score
TEST SW     026-026    Test score waiver:
                         1 = The recruit received a waiver for
                             entry into the rating program
                             desired due to low test scores.
                         0 = otherwise.
GS          027-029    General Science test score
AR          030-032    Arithmetic Reasoning test score
WK          033-035    Word Knowledge test score
PC          036-038    Paragraph Comprehension test score
NO          039-041    Numerical Operations test score
CS          042-044    Coding Speed test score
AS          045-047    Auto Shop test score
MK          048-050    Math Knowledge test score
MC          051-053    Mechanical Comprehension test score
EI          054-056    Electronics Information test score
RACE        057-057    C = Caucasian
                       B = Black
                       X = Other
                       Z = Unknown
                       R = American Indian
                       M = Asian
PAYGRD      058-058    Initial Paygrade (1-9)
SCREEN      059-061    Screen Score:
                         This is the probability a recruit
                         will complete one year of service.
                         Screen scores were developed at CNA.
MATCHS      062-062    Matched SCAT Flag:
                         1 = yes
                         0 = no
                         SCAT = System Consolidation for
                                Accessions and Trainees
RTCFLG      063-063    Recruit Training Command Flag:
                         1 = completed Boot Camp
                         0 = did not complete Boot Camp
PDEPS       064-064    Primary Dependents:
                         0 = no primary dependents
                         1 = spouse only
                         2 = spouse and 1 child
                         ...
                         9 = spouse and 8 children or more
                         A = no spouse and 1 child
                         ...
                         H = no spouse and 8 children or more
RATE        065-068    Present Rating
PAYGR       069-069    Present Paygrade
ADSD        070-075    Active Duty Service Date
PEBD        076-079    Pay Entry Base Date
EAOS        080-083    End of Active Obligated Service
COMPLD      084-087    Year-Month NITRAS course completed
COURSE      088-091    NITRAS course code
STUDAC      092-093    NITRAS student action code
                         NITRAS = Navy Integrated Training
                                  System
AGE         094-096    Age of recruit
LEFTNAV     097-097    Left Navy flag:
                         1 = person left the Navy
                         0 = person did not leave the Navy
ATEAOS      098-098    EAOS Flag:
                         1 = person left at EAOS
                         0 = otherwise
MOSIN       099-101    Given a person left the Navy, this is
                       the number of months on active duty.
                       If the person is still on active
                       duty, the field is coded 0.
COMPS       102-104    Composite test score
RATEF       105-108    Final Rating
BLANK       109-109    Blank column
E2          110-112    Months in paygrade E2
E3          113-115    Months in paygrade E3
E4          116-118    Months in paygrade E4
E5          119-121    Months in paygrade E5
INITRAT     122-125    Initial Rating
RATCHG      126-127    Number of Rating Changes
DATE        128-130    Month-Year of change
RATING      131-134    Rating Code
DATE        135-137    Month-Year of change
RATING      138-141    Rating Code
DATE        142-144    Month-Year of change
RATING      145-148    Rating Code
PAYCHG      149-150    Number of Paygrade Changes
DATE        151-154    Year-Month of change
PAYGRADE    155-155    Paygrade Code
DATE        156-159    Year-Month of change
PAYGRADE    160-160    Paygrade Code
DATE        161-164    Year-Month of change
PAYGRADE    165-165    Paygrade Code
DATE        166-169    Year-Month of change
PAYGRADE    170-170    Paygrade Code
DATE        171-174    Year-Month of change
PAYGRADE    175-175    Paygrade Code
NECCHG      176-177    Number of NEC changes
DATE        178-180    Month-Year of change
NEC         181-184    NEC code
DATE        185-187    Month-Year of change
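Because each variable occupies a fixed character range, a record can be split by slicing. The sketch below is a hypothetical illustration, not the program used in the study. It assumes the POSITION values are 1-based and inclusive, and it uses only three fields whose positions are legible in the listing (RECNUM 001-009, RATING 014-016, SEX 018-018), paired with the variable names in row order. The names `FIELDS` and `parse_record` are invented for this example.

```python
# Field positions are 1-based and inclusive, as in the POSITION column.
# Only three fields are shown; the mapping is an assumption for illustration.
FIELDS = {
    "RECNUM": (1, 9),
    "RATING": (14, 16),
    "SEX": (18, 18),
}


def parse_record(line, fields=FIELDS):
    """Slice one fixed-width record into a dict of stripped strings."""
    return {
        name: line[start - 1:end].strip()
        for name, (start, end) in fields.items()
    }
```

For example, a record whose columns 14-16 hold "AW " yields `"AW"` for RATING after stripping the padding.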
APPENDIX C
PROGRAM LISTING
We built our models using the SAS programming language. SAS provided us with
numerical computation, statistical results, and graphical summaries. The SAS programs
used in this study are listed in the following order.

Basic Training:       Two Factor ANOVA Model
                      Single Factor ANOVA Model
Attrition:            Non-Linear Exponential Model
                      Simple Linear Regression Model
Specialized Training: Simple Linear Regression Model
                      Single Factor ANOVA Model

BASIC TRAINING
Two Factor ANOVA Model
OPTIONS LINESIZE=80;
DATA MTGR;
TITLE 'MONTHS TO GET RATED';
INPUT I M R $ Y;
LABEL I = 'ID NUMBER';
LABEL M = 'MONTHS TO GET RATED';
LABEL R = 'RATING';
LABEL Y = 'YEAR';
/* data lines follow CARDS and are omitted from this listing */
CARDS;
PROC GLM DATA=MTGR;
CLASS R Y;
MODEL M = R Y R*Y / P CLI;
MEANS Y;
MEANS Y/TUKEY;
OUTPUT OUT = STATS P = YHAT R = RESID;
PROC PLOT DATA = STATS;
PLOT M * Y = '*';


Single Factor ANOVA Model
OPTIONS LINESIZE=80;
DATA BTP;
TITLE 'MONTHS TO GET RATED';
INPUT I M Y;
LABEL I = 'ID NUMBER';
LABEL M = 'MONTHS TO GET RATED';
LABEL Y = 'YEAR';
/* data lines follow CARDS and are omitted from this listing */
CARDS;
PROC GLM DATA=BTP;
ID I;
CLASS Y;
MODEL M = Y / P CLI;
MEANS Y/TUKEY;
OUTPUT OUT = STATS P = YHAT R = RESID;
PROC PLOT DATA = STATS;
PLOT M * Y = '*';
PLOT YHAT * Y = '*';
PLOT RESID * Y = '*' / VREF = 0;
PLOT RESID * YHAT = '*' / VREF = 0;
PROC CHART DATA = STATS;
VBAR RESID;
ATTRITION
Non-Linear Exponential Model
OPTIONS LINESIZE=80;
DATA TAR;
TITLE 'ATTRITION RATES: E(NT) = N*EXP(-LAMBDA*T)';
INPUT T L CL NT P R Y;
/* N = initial group size; this line is reconstructed from P = NT/N */
N = NT / P;
LABEL T = 'TIME IN MONTHS';
LABEL L = 'LOSSES';
LABEL CL = 'CUMULATIVE LOSSES';
LABEL NT = 'NUMBER OF SURVIVORS AT TIME T';
LABEL P = 'PERCENT OF SURVIVORS AT TIME T';
LABEL R = 'RATING';
LABEL Y = 'YEAR GROUP';
CARDS;
PROC NLIN DATA=TAR;
PARAMETERS LAMBDA = .01;
MODEL NT = N*EXP(-LAMBDA*T);
DER.LAMBDA = -N*T*EXP(-LAMBDA*T);
OUTPUT OUT = STATS P = NTHAT R = RESID;
PROC PLOT DATA = STATS;
PLOT NT*T = 'A' NTHAT*T = 'P' / OVERLAY;
PLOT RESID * T = '*' / VREF = 0;
PLOT RESID * NT = '*' / VREF = 0;
PROC CHART DATA = STATS;
VBAR RESID;
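The fit that the NLIN step above requests can be sketched outside SAS. The Python below is an illustration under stated assumptions, not the algorithm SAS runs: it applies plain Gauss-Newton updates to the single parameter LAMBDA, using the same model and derivative as the listing and the same starting value of .01. The function names are hypothetical. The sample vectors are the legible rows of the first attrition block in Appendix D (initial group size 131).

```python
import math


def fit_lambda(times, survivors, n0, lam=0.01, iterations=50):
    """Gauss-Newton fit of lam in E(NT) = n0 * exp(-lam * T)."""
    for _ in range(iterations):
        num = 0.0
        den = 0.0
        for t, nt in zip(times, survivors):
            pred = n0 * math.exp(-lam * t)
            deriv = -n0 * t * math.exp(-lam * t)  # d(pred)/d(lam), as in DER.LAMBDA
            num += deriv * (nt - pred)
            den += deriv * deriv
        if den == 0.0:
            break
        step = num / den
        lam += step
        if abs(step) < 1e-12:
            break
    return lam


def sse(times, survivors, n0, lam):
    """Sum of squared residuals for a given lam."""
    return sum((nt - n0 * math.exp(-lam * t)) ** 2
               for t, nt in zip(times, survivors))


# Legible rows of the first block: months and survivors.
TIMES = [0, 1, 2, 3, 5, 6, 7]
SURVIVORS = [131, 125, 123, 121, 119, 117, 116]
```

Comparing `sse` at the starting value and at the fitted value shows whether the iterations reduced the error.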
Simple Linear Regression Model
OPTIONS LINESIZE=80;
DATA AR;
TITLE 'ATTRITION RATES';
INPUT A R Y X;
LABEL A = 'ATTRITION RATE';
LABEL R = 'RATING';
LABEL Y = 'YEAR GROUP';
LABEL X = 'GROUP NUMBER';
CARDS;
PROC REG DATA=AR;
MODEL A = X / DW P R CLI CLM INFLUENCE;
OUTPUT OUT=STATS P=PRED R=RESID
COOKD=CD H=HAT RSTUDENT=RS;
PROC PLOT DATA = STATS;
PLOT A*X;
PLOT PRED*X;
PLOT RESID*X / VREF = 0;
PLOT RESID*PRED / VREF = 0;
PLOT HAT*X / VREF = 0;
PLOT RS*A / VREF = 0;
PLOT CD*X / VREF = 0;
PROC CHART;
VBAR RESID;
SPECIALIZED TRAINING
Simple Linear Regression Model
OPTIONS LINESIZE=80;
DATA TNECAVG;
TITLE 'NECS PER INDIVIDUAL';
INPUT N S A Y X;
LABEL N = 'NUMBER OF NECS';
LABEL S = 'SIZE OF YEAR GROUP';
LABEL A = 'AVERAGE NUMBER OF NECS PER INDIVIDUAL';
LABEL Y = 'YEAR GROUP';
LABEL X = 'SUBGROUP';
CARDS;
PROC REG DATA=TNECAVG;
ID X;
MODEL A = X / DW P R CLI CLM INFLUENCE;
OUTPUT OUT=STATS P=PRED R=RESID
COOKD=CD H=HAT RSTUDENT=RS;
PROC PLOT DATA = STATS;
PLOT A*X;
PLOT PRED*X;
PLOT RESID*X / VREF = 0;
PLOT RESID*PRED / VREF = 0;
PLOT HAT*X / VREF = 0;
PLOT RS*A / VREF = 0;
PLOT CD*X / VREF = 0;
PROC CHART;
VBAR RESID;


Single Factor ANOVA Model
OPTIONS LINESIZE=80;
DATA TNEC;
TITLE 'NECS PER YEAR GROUP';
INPUT I M N Y R;
/* S = total NECs over both years; this line is reconstructed from context */
S = M + N;
LABEL I = 'ID NUMBER';
LABEL M = 'SECOND YEAR NUMBER OF NECS';
LABEL N = 'THIRD YEAR NUMBER OF NECS';
LABEL Y = 'YEAR';
LABEL R = 'RATING';
CARDS;
PROC GLM DATA=TNEC;
CLASS Y;
MODEL S = Y / P CLI;
MEANS Y;
MEANS Y/TUKEY;
OUTPUT OUT = STATS P = SHAT R = SRESID;
PROC PLOT DATA = STATS;
PLOT S * Y;
PLOT SHAT * Y;
PLOT SRESID * Y / VREF = 0;
PLOT SRESID * SHAT / VREF = 0;
PROC CHART DATA = STATS;
VBAR SRESID;
APPENDIX D
DATA VECTORS 


These are the numerical values we used in the SAS programs. Numbers used in 
the first two ANOVA models will not be listed. 
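Each row of the attrition data holds the fields read by the NLIN program's INPUT statement: time in months (T), losses (L), cumulative losses (CL), survivors (NT), and fraction surviving (P). These fields are tied together by simple arithmetic, so a row can be checked against its neighbors. The Python sketch below is a hypothetical helper, not part of the study. It assumes that CL is the running sum of L, that NT equals the initial group size minus CL, and that P is NT divided by the initial group size to three decimals.

```python
def check_block(rows, tol=0.001):
    """Return messages for rows that break the column relations.

    rows: (T, L, CL, NT, P) tuples in listed order, first row at T = 0.
    tol:  allowed gap between P and NT / n0 (P is printed to 3 decimals).
    """
    problems = []
    n0 = rows[0][3]  # initial group size is NT in the first row
    running = 0
    for t, l, cl, nt, p in rows:
        running += l
        if cl != running:
            problems.append(f"T={t}: CL {cl} != running losses {running}")
        if nt != n0 - cl:
            problems.append(f"T={t}: NT {nt} != {n0} - {cl}")
        if abs(p - nt / n0) > tol:
            problems.append(f"T={t}: P {p} != {nt}/{n0}")
    return problems


# First three rows of the first block as printed.
SAMPLE = [
    (0, 0, 0, 131, 1.000),
    (1, 6, 6, 125, 0.954),
    (2, 2, 8, 123, 0.938),
]
```

The same relations make it possible to recover a digit that is unreadable in one column from the other columns of the same row.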


ATTRITION

Non-Linear Regression Data Set

AT AW AX

00 00 00 131 1.000
01 06 06 125 0.954
02 02 08 123 0.938
03 02 10 121 0.925
04 01 11 120 0.916
05 01 12 119 0.908
06 02 14 117 0.893
07 01 15 116 0.885
08 01 16 115 0.877
19 01 17 114 0.870
20 01 18 113 0.862

00 00 00 173 1.000
01 12 12 161 0.930
02 05 17 156 0.901
05 01 18 155 0.895
06 01 19 154 0.890
09 01 20 153 0.884
11 01 21 152 0.878
12 01 22 151 0.872
16 01 23 150 0.867
17 02 25 148 0.855
18 01 26 147 0.849
20 01 27 146 0.843
21 01 28 145 0.838
23 02 30 143 0.826
24 01 31 142 0.820

00 00 00 330 1.000
01 19 19 311 0.942
02 19 38 292 0.884
04 02 40 290 0.878
05 02 42 288 0.872
06 02 44 286 0.866
08 01 45 285 0.863
09 03 48 282 0.854
10 01 49 281 0.851
11 01 50 280 0.848
12 02 52 278 0.842
13 01 53 277 0.839
14 02 55 275 0.833
15 02 57 273 0.827
16 01 58 272 0.824
22 02 60 270 0.818
25 01 61 269 0.815

00 00 00 324 1.000 2 80
01 10 10 314 0.969 2 80
02 11 21 303 0.935 2 80
291
284 
Zol 
280 
277 
278 
276 
275 
274 
Efe 
270 